feat(cnki): add detail extraction and advanced search #1855

Open
spring-peach wants to merge 1 commit into jackwener:main from spring-peach:feat/cnki-detail-search

Conversation

@spring-peach

Summary

  • add a cnki/detail command for metadata, abstract, and keyword extraction
  • expand cnki/search with advanced query options, date/type filters, pagination, and optional abstract enrichment
  • add shared CNKI helpers and adapter tests

Tests

  • npm run build
  • npx vitest run --project adapter clis/cnki/search.test.js
  • npm run check:silent-column-drop
  • npm run check:typed-error-lint

Copilot AI review requested due to automatic review settings June 4, 2026 15:51

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds richer CNKI support by introducing a dedicated detail command and expanding the search command with advanced query options plus optional per-result detail extraction.

Changes:

  • Added cnki/detail command and shared helpers for URL normalization, search URL building, and detail-page extraction.
  • Reworked cnki/search to support advanced search expressions, field/date/type filters, paging, and optional abstract extraction.
  • Expanded Vitest coverage for command registration and key validation/normalization behaviors; updated CLI manifest accordingly.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| clis/cnki/shared.js | New shared URL helpers and in-browser detail-page extractor; adds extractCnkiDetail. |
| clis/cnki/search.js | Major rewrite of CNKI search command: advanced options, pagination handling, optional detail scraping. |
| clis/cnki/detail.js | New CNKI “detail” CLI command built on shared extraction logic. |
| clis/cnki/search.test.js | Adds tests for new shared helpers and new detail command; expands search validations. |
| cli-manifest.json | Registers new cnki/detail command and updates cnki/search metadata/args/columns. |

Comment thread clis/cnki/shared.js
Comment on lines +81 to +86
const stopLabels = [
  '\\u6458\\u8981', '\\u5173\\u952e\\u8bcd', '\\u4e13\\u8f91',
  '\\u4e13\\u9898', '\\u5206\\u7c7b\\u53f7',
  '\\u5728\\u7ebf\\u516c\\u5f00\\u65f6\\u95f4', '\\u57fa\\u91d1',
  '\\u4f5c\\u8005', '\\u6765\\u6e90', 'DOI', 'CLC', 'Fund'
];
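For readers who do not read the escapes directly: the labels decode to the Chinese for abstract (摘要), keywords (关键词), collection (专辑), topic (专题), classification number (分类号), online publication time (在线公开时间), fund (基金), author (作者), and source (来源). The excerpt does not show how the list is consumed, so the following is only a minimal sketch of one plausible use: cutting a labelled field out of flattened page text at the next stop label. `extractField` and the sample text are hypothetical and not taken from the PR; the escapes are single-escaped here because the sketch runs directly rather than inside an injected string.

```javascript
// A subset of the labels from the diff, single-escaped for direct execution.
const stopLabels = ['\u6458\u8981', '\u5173\u952e\u8bcd', '\u57fa\u91d1', 'DOI'];

// Hypothetical helper: return the text that follows `label`, ending at the
// nearest following stop label (or at the end of the text).
function extractField(text, label, stops) {
  const start = text.indexOf(label);
  if (start === -1) return '';
  const from = start + label.length;
  let end = text.length;
  for (const stop of stops) {
    if (stop === label) continue;
    const idx = text.indexOf(stop, from);
    if (idx !== -1 && idx < end) end = idx;
  }
  // Drop a leading ASCII or full-width colon plus surrounding whitespace.
  return text.slice(from, end).replace(/^[\s:\uff1a]+/, '').trim();
}

// Made-up sample text; the DOI is a placeholder.
const sample =
  '\u6458\u8981\uff1aA short abstract. \u5173\u952e\u8bcd\uff1agraph; search DOI: 10.0000/example';
console.log(extractField(sample, '\u6458\u8981', stopLabels));       // A short abstract.
console.log(extractField(sample, '\u5173\u952e\u8bcd', stopLabels)); // graph; search
```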
Comment thread clis/cnki/shared.js
Comment on lines +3 to +10
export function normalizeCnkiUrl(url) {
  const raw = String(url || '').trim();
  if (!raw) return '';
  if (/^https?:\/\//i.test(raw)) return raw;
  if (raw.startsWith('//')) return `https:${raw}`;
  if (raw.startsWith('/')) return `https://kns.cnki.net${raw}`;
  return `https://kns.cnki.net/${raw}`;
}
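To make the branches of `normalizeCnkiUrl` concrete, the sketch below repeats the function from the diff (without `export`, so it runs standalone) and feeds it a few inputs. The paths and query strings are made-up placeholders, not real CNKI URLs.

```javascript
// Copy of the helper from the diff, minus `export`, so the snippet runs on its own.
function normalizeCnkiUrl(url) {
  const raw = String(url || '').trim();
  if (!raw) return '';
  if (/^https?:\/\//i.test(raw)) return raw;
  if (raw.startsWith('//')) return `https:${raw}`;
  if (raw.startsWith('/')) return `https://kns.cnki.net${raw}`;
  return `https://kns.cnki.net/${raw}`;
}

// Hypothetical inputs; the paths are placeholders.
const samples = [
  ' https://kns.cnki.net/example?v=1 ', // absolute: only trimmed
  '//kns.cnki.net/example',             // protocol-relative: gets https:
  '/example/path',                      // root-relative: gets the CNKI origin
  'example/path',                       // bare path: origin plus a slash
  null,                                 // empty input: ''
];
for (const s of samples) console.log(JSON.stringify(normalizeCnkiUrl(s)));
```

The final branch prefixes anything that is not http(s), so a string such as `ftp://host/x` would also be rewritten into a `kns.cnki.net` path. Whether that matters depends on what callers pass in.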
Comment thread clis/cnki/search.js
Comment on lines +5 to +9
function parseLimit(value, fallback = 10) {
  const parsed = Number.parseInt(String(value ?? ''), 10);
  if (Number.isNaN(parsed)) return fallback;
  return Math.max(0, parsed);
}
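A quick way to see what `parseLimit` accepts is to run it on a handful of values. The function below is copied from the diff; the inputs are illustrative only.

```javascript
// Copied from the diff so the snippet is self-contained.
function parseLimit(value, fallback = 10) {
  const parsed = Number.parseInt(String(value ?? ''), 10);
  if (Number.isNaN(parsed)) return fallback;
  return Math.max(0, parsed);
}

console.log(parseLimit('25'));      // 25
console.log(parseLimit('12.7'));    // 12 (parseInt truncates)
console.log(parseLimit('abc'));     // 10 (fallback)
console.log(parseLimit(undefined)); // 10 (fallback)
console.log(parseLimit(-5));        // 0  (clamped)
console.log(parseLimit('0', 20));   // 0  (zero is accepted as-is)
```

As written, negative values and zero both come back as 0, and there is no upper bound, so a caller that needs at least one result or a maximum page size would have to enforce that separately.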
Comment thread clis/cnki/search.js
Comment on lines +314 to +316
if (results.length === before) break;
break;
}
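If both `break` statements in this excerpt sit directly in the same loop body (the excerpt does not show the surrounding lines), the unconditional one makes the conditional one redundant and the loop exits on its first pass. As a rough sketch of the alternative, a pagination loop can stop only when a page adds nothing new or the limit is reached. `collectPages`, `fetchPage`, and the `url` key are hypothetical names, and the sketch is synchronous for brevity, whereas a browser-backed command would await each page.

```javascript
// Hypothetical helper: collect up to `limit` rows across pages.
// `fetchPage(pageNumber)` is an assumed callback returning an array of rows.
function collectPages(fetchPage, limit, maxPages = 20) {
  const results = [];
  const seen = new Set();
  for (let page = 1; page <= maxPages && results.length < limit; page += 1) {
    const before = results.length;
    for (const row of fetchPage(page)) {
      if (seen.has(row.url)) continue; // skip rows repeated across pages
      seen.add(row.url);
      results.push(row);
      if (results.length >= limit) break;
    }
    // Stop only when a page contributed nothing new.
    if (results.length === before) break;
  }
  return results;
}

// Usage with a fake three-page data source.
const pages = {
  1: [{ url: 'a' }, { url: 'b' }],
  2: [{ url: 'b' }, { url: 'c' }],
  3: [],
};
const rows = collectPages(page => pages[page] || [], 10);
console.log(rows.map(r => r.url)); // [ 'a', 'b', 'c' ]
```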
Comment thread clis/cnki/search.js
Comment on lines +26 to +71
function parseDocTypes(value) {
  const map = {
    all: '',
    journal: 'YSTT4HG0',
    journals: 'YSTT4HG0',
    dissertation: 'LSTPFY1C',
    dissertations: 'LSTPFY1C',
    thesis: 'LSTPFY1C',
    degree: 'LSTPFY1C',
    conference: 'JUP3MUPD',
    conferences: 'JUP3MUPD',
    newspaper: 'MPMFIG1A',
    newspapers: 'MPMFIG1A',
    book: 'EMRPGLPA',
    books: 'EMRPGLPA',
    standard: 'WQ0UVIAA',
    standards: 'WQ0UVIAA',
    achievement: 'BLZOG7CK',
    achievements: 'BLZOG7CK',
    patent: 'VUDIXAIY',
    patents: 'VUDIXAIY',
    yearbook: 'HHCPM1F8',
    yearbooks: 'HHCPM1F8',
    ccjd: 'PWFIRAGL',
    special: 'NN3FJMUV',
    video: 'NLBO1Z6R',
    videos: 'NLBO1Z6R',
    library: 'T2VC03OH',
    ystt4hg0: 'YSTT4HG0',
    lstpfy1c: 'LSTPFY1C',
    jup3mupd: 'JUP3MUPD',
    mpmfig1a: 'MPMFIG1A',
    emrpglpa: 'EMRPGLPA',
    wq0uviaa: 'WQ0UVIAA',
    blzog7ck: 'BLZOG7CK',
    vudixaiy: 'VUDIXAIY',
    hhcpm1f8: 'HHCPM1F8',
    pwfiragl: 'PWFIRAGL',
    nn3fjmuv: 'NN3FJMUV',
    nlbo1z6r: 'NLBO1Z6R',
    t2vc03oh: 'T2VC03OH',
  };
  const values = normalizeList(value);
  if (values.length === 0 || values.includes('all')) return '';
  return Array.from(new Set(values.map(item => map[item] || item.toUpperCase()))).join(',');
}
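The behaviour of `parseDocTypes` is easier to judge with a few inputs. The sketch below keeps the function's last three lines as in the diff but shortens the map to a few entries and supplies a stand-in for `normalizeList`, which the excerpt does not show. The stand-in (split on commas, trim, lowercase, drop empties) is an assumption about what that helper does.

```javascript
// ASSUMPTION: normalizeList is not part of the excerpt; this stand-in splits
// on commas, trims, lowercases, and drops empty items.
function normalizeList(value) {
  return String(value ?? '')
    .split(',')
    .map(item => item.trim().toLowerCase())
    .filter(Boolean);
}

// Abbreviated map: a few entries copied from the diff.
const map = {
  journal: 'YSTT4HG0',
  journals: 'YSTT4HG0',
  thesis: 'LSTPFY1C',
  patent: 'VUDIXAIY',
};

function parseDocTypes(value) {
  const values = normalizeList(value);
  if (values.length === 0 || values.includes('all')) return '';
  return Array.from(new Set(values.map(item => map[item] || item.toUpperCase()))).join(',');
}

console.log(parseDocTypes('journal, Journals, thesis')); // YSTT4HG0,LSTPFY1C
console.log(JSON.stringify(parseDocTypes('patent,all'))); // ""
console.log(parseDocTypes('unknown'));                    // UNKNOWN
```

Under these assumptions, aliases collapse to a single code, `all` anywhere in the list disables the filter, and an unrecognised name is passed through uppercased rather than rejected.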

2 participants